[ZEPPELIN-3986]. Cannot access any JAR in yarn cluster mode #3308
    InterpreterResult result = interpreter.interpret("import com.databricks.spark.avro._", getInterpreterContext());
    assertEquals(InterpreterResult.Code.SUCCESS, result.code());
  }
We cannot test the user jar case in a unit test (MutableURLClassLoader only exists in a real Spark app launched via spark-submit), so it is removed here.
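To illustrate why this is hard to cover in a plain unit test, here is a minimal sketch of looking up a class loader by class name along the parent chain. It is not the code from this PR. The class `ClassLoaderChain`, the method `findLoaderNamed`, and the fully qualified name `org.apache.spark.util.MutableURLClassLoader` are assumptions made for illustration.

```java
final class ClassLoaderChain {

    /**
     * Walks from the given loader up through its parents and returns the first
     * loader whose class name equals loaderClassName, or null if none matches.
     */
    static ClassLoader findLoaderNamed(ClassLoader start, String loaderClassName) {
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            if (cl.getClass().getName().equals(loaderClassName)) {
                return cl;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ClassLoader start = Thread.currentThread().getContextClassLoader();
        // Assumed class name; in a JVM that was not started by spark-submit
        // no loader of this type is expected to be in the chain.
        ClassLoader found =
            findLoaderNamed(start, "org.apache.spark.util.MutableURLClassLoader");
        System.out.println("Spark user class loader present: " + (found != null));
    }
}
```

In a test JVM started by the build tool, the lookup is expected to find nothing, which is the situation the comment above describes.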
felixcheung left a comment
Why is kafka-clients-0.11.0.3.jar in resources? I think we need to avoid binaries in the repo (especially since this is a big one).
    .filter { u => u.getProtocol == "file" && new File(u.getPath).isFile }
    // Some bad spark packages depend on the wrong version of scala-reflect. Blacklist it.
    .filterNot { u =>
      Paths.get(u.toURI).getFileName.toString.contains("org.scala-lang_scala-reflect")
Can we make this configurable?
User jars are configurable via spark.jars or spark.jars.packages. This code is the internal runtime mechanism for detecting which user jars have been specified.
I mean the org.scala-lang_scala-reflect
also indentation seems off
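One way the exclusion could be made configurable is to pass the excluded file-name fragments in as a list instead of hard-coding one string. The following is a rough Java sketch of that idea, not the change made in this PR. The class `UserJarFilter` and its methods are hypothetical, and the file name is taken from `File.getName()` rather than `Paths.get(u.toURI)` to keep the example free of checked exceptions.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UserJarFilter {

    /** Returns true if the file name contains any of the excluded fragments. */
    static boolean isExcluded(String fileName, List<String> excludedFragments) {
        for (String fragment : excludedFragments) {
            if (fileName.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    /** Keeps URLs that point at existing local files and are not excluded by name. */
    static List<String> filterUserJars(URL[] urls, List<String> excludedFragments) {
        List<String> jars = new ArrayList<>();
        for (URL u : urls) {
            File f = new File(u.getPath());
            boolean localFile = "file".equals(u.getProtocol()) && f.isFile();
            if (localFile && !isExcluded(f.getName(), excludedFragments)) {
                jars.add(u.toString());
            }
        }
        return jars;
    }

    public static void main(String[] args) throws IOException {
        // Empty placeholder files stand in for real jars.
        Path dir = Files.createTempDirectory("user-jars");
        Path keep = Files.createFile(dir.resolve("my-udfs.jar"));
        Path drop = Files.createFile(dir.resolve("org.scala-lang_scala-reflect-2.11.8.jar"));
        URL[] urls = { keep.toUri().toURL(), drop.toUri().toURL() };
        // The exclusion list would come from configuration in a real setup.
        List<String> exclusions = Arrays.asList("org.scala-lang_scala-reflect");
        System.out.println(filterUserJars(urls, exclusions));
    }
}
```

With an empty exclusion list the filter only checks that each URL is an existing local file, so the default behaviour would still need to include the scala-reflect fragment to match the snippet under review.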
@felixcheung kafka jar is for unit testing. https://github.com/apache/zeppelin/pull/3308/files#diff-338173a3721f4282e638ac5145b4bfe7R88
We really shouldn't include a binary jar in the source code repo.
Other projects avoid this by building the test jar from source during the test run.
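As a rough illustration of generating a jar at test time instead of committing one, the sketch below writes a small jar with `java.util.jar.JarOutputStream` and checks that a class loader can see its content. For brevity it packages a text resource rather than compiled classes. The class `TestJarBuilder` and the file names are hypothetical and are not taken from this PR.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class TestJarBuilder {

    /** Writes a jar containing one entry with the given text content and returns its path. */
    static Path buildJar(Path dir, String jarName, String entryName, String content)
            throws IOException {
        Manifest manifest = new Manifest();
        // A manifest is only written out if it declares a version.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        Path jar = dir.resolve(jarName);
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(content.getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("test-jars");
        Path jar = buildJar(dir, "generated-test.jar", "hello.txt", "hello");
        // Load the generated jar in isolation (null parent) and look up the entry.
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] { jar.toUri().toURL() }, null)) {
            System.out.println("resource visible: " + (loader.getResource("hello.txt") != null));
        }
    }
}
```

A jar produced this way lives only in a temporary directory, so nothing binary has to be checked into the repository.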
@felixcheung I have removed the kafka jar and now use the zeppelin-interpreter-integration jar instead.
  private void testInterpreterBasics() throws IOException, InterpreterException, XmlPullParserException {
    // add jars & packages for testing
    InterpreterSetting sparkInterpreterSetting = interpreterSettingManager.getInterpreterSettingByName("spark");
    sparkInterpreterSetting.setProperty("spark.jars.packages", "com.maxmind.geoip2:geoip2:2.5.0");
Use something in org.apache.zeppelin instead?
It may already have been shipped to the Spark driver and put on the classpath, so it is better to use some other library.
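The concern above can be made concrete with a small probe: if a class is already loadable before any user jar is added, importing it proves nothing about spark.jars.packages. The sketch below is only an illustration. The class `ClasspathProbe` is hypothetical, and `com.maxmind.geoip2.DatabaseReader` is assumed to be a class shipped in the geoip2 artifact.

```java
final class ClasspathProbe {

    /** Returns true if the named class can be loaded by the given loader, without initializing it. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        // A JDK class is always visible, so it says nothing about user jars.
        System.out.println("java.lang.String: " + isLoadable("java.lang.String", loader));
        // Assumed class name from the geoip2 package; it is only loadable
        // if that artifact has actually been added to the classpath.
        System.out.println("geoip2: "
            + isLoadable("com.maxmind.geoip2.DatabaseReader", loader));
    }
}
```

A library that is absent by default makes a better test subject than one that may already be present on the driver.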
Will merge if no more comments.
Hi @zjffdu, many thanks.
Both master & 0.8.2
Thanks @zjffdu, export SPARK_SUBMIT_OPTIONS="--jars " I cloned the latest git and checked out 0.8, so it already has this pull request merged.
On top of this,
What is this PR for?
User-specified jars are missing in yarn-cluster mode because the user jars were not detected correctly. This PR fixes the jar detection logic in BaseSparkScalaInterpreter.
What type of PR is it?
[Bug Fix]
Todos
What is the Jira issue?
How should this be tested?
spark.jars & spark.jars.packages
Screenshots (if appropriate)
Questions: